Order-Preserving Incomplete Suffix Trees and Order-Preserving Indexes

Authors

  • Maxime Crochemore
  • Costas S. Iliopoulos
  • Tomasz Kociumaka
  • Marcin Kubica
  • Alessio Langiu
  • Solon P. Pissis
  • Jakub Radoszewski
  • Wojciech Rytter
  • Tomasz Walen
Abstract

Recently, Kubica et al. (Inf. Process. Lett., 2013) and Kim et al. (submitted to Theor. Comp. Sci.) introduced order-preserving pattern matching: given a text, the goal is to find its factors that have the same “shape” as a given pattern. The known results include a linear-time algorithm for this problem (for a polynomially bounded alphabet) and a generalization to multiple patterns. We extend these results and give an O(n log log n)-time construction of an index that supports order-preserving pattern matching queries in time linear in the length of the pattern. The main novel component is a data structure that is an incomplete suffix tree in the order-preserving setting. The tree may miss single letters related to branchings at internal nodes. This incompleteness results from the weakness of our so-called weak character oracle. However, thanks to its weakness, such an oracle can be computed in O(log log n) time on-line using a sliding-window approach. For most applications, such incomplete suffix trees provide the same functional power as complete ones. We also give an O(n log n / log log n)-time algorithm for constructing complete order-preserving suffix trees.

* Supported by the NSF-funded iPlant Collaborative (NSF grant #DBI-0735191).
** Supported by grant no. N206 566740 of the National Science Centre.
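To make the notion of “shape” concrete, here is a minimal Python sketch of order-preserving matching by brute force. It is only an illustrative baseline, not the index or the suffix-tree construction described in the abstract. The function names `shape` and `op_match_naive` are hypothetical, and the treatment of equal symbols (ties receive equal ranks) is an assumption made for this example.

```python
from typing import List, Sequence


def shape(seq: Sequence[int]) -> List[int]:
    """Replace each element by its rank among the distinct values of seq.

    Two sequences have the same "shape" exactly when these rank lists agree.
    Assumption for this sketch: equal values receive equal ranks.
    """
    ranks = {value: rank for rank, value in enumerate(sorted(set(seq)))}
    return [ranks[value] for value in seq]


def op_match_naive(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return all start positions where a factor of text has the pattern's shape.

    Brute force: every window of length m is re-ranked, so the running time is
    O(n * m log m). The index in the abstract answers such queries in time
    linear in the pattern length; this baseline does not attempt that.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = shape(pattern)
    return [
        start
        for start in range(len(text) - m + 1)
        if shape(text[start:start + m]) == target
    ]


if __name__ == "__main__":
    text = [10, 22, 15, 30, 20, 18, 27]
    pattern = [1, 3, 2]  # shape: low, high, middle
    print(op_match_naive(text, pattern))
```

In this example the pattern `[1, 3, 2]` matches the factors `[10, 22, 15]` and `[15, 30, 20]`, since each has its smallest value first, its largest second, and its middle value last.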

Similar resources

Order-Preserving Suffix Trees and Their Algorithmic Applications

Recently, Kubica et al. (Inf. Process. Lett., 2013) and Kim et al. (submitted to Theor. Comp. Sci.) introduced order-preserving pattern matching. In this problem we are looking for consecutive substrings of the text that have the same “shape” as a given pattern. These results include a linear-time order-preserving pattern matching algorithm for polynomially-bounded alphabet and an extension of th...

Robust Fixed-order Gain-scheduling Autopilot Design using State-space Stability-Preserving Interpolation

In this paper, a robust autopilot is proposed using stable interpolation based on Youla parameterization. The most important condition of stable interpolation between local controllers is the preservation of stability so that each local controller can ensure stability for an open neighborhood around a nominal point. The proposed design used fixed-order robust controller with parameter-dependent...

Entropy-Compressed Indexes for Multidimensional Pattern Matching

In this talk, we will discuss the challenges involved in developing multidimensional generalizations of compressed text indexing structures. These structures depend on some notion of Burrows-Wheeler transform (BWT) for multiple dimensions, though naive generalizations do not enable multidimensional pattern matching. We study the 2D case to possibly highlight combinatorial properties that do n...

Optimum-width upward order-preserving poly-line drawings of trees

An upward drawing of a tree is a drawing such that no parents are below their children. It is order-preserving if the edges to children appear in prescribed order around each vertex. Chan showed that any tree has an upward order-preserving drawing with width O(log n). In this paper, we consider upward order-preserving drawings where edges are allowed to have bends. We present a linear-time algor...

FMtree: a fast locating algorithm of FM-indexes for genomic data

Motivation: As a fundamental task in bioinformatics, searching for massive short patterns over a long text has been accelerated by various compressed full-text indexes. These indexes are able to provide similar searching functionalities to classical indexes, e.g. suffix trees and suffix arrays, while requiring less space. For genomic data, a well-known family of compressed full-text indexes, cal...

Publication date: 2013